Record: GDN-Hybrid + Sliding Window Attention (cold-cache, 1.01710 BPB) #1564

Closed
joshkmartinez wants to merge 4 commits into openai:main from joshkmartinez:submission-run039-safe019-1.0171

@joshkmartinez

Summary

3-Seed Results

| Seed | Steps | EMA BPB | val_bpb | XSA BPB | Artifact bytes |
|---|---|---|---|---|---|
| 314 | 2223 | 1.007670 | 1.016476 | 1.020950 | 15,522,111 |
| 777 | 2239 | 1.007192 | 1.016192 | 1.020919 | 15,814,260 |
| 2718 | 2240 | 1.009535 | 1.018633 | 1.023874 | 15,981,262 |
| Mean | | 1.008132 | 1.01710033 | 1.02191433 | 15,772,544.33 |
| Std (sample) | | | 0.00133490 | | |
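
The mean and sample standard deviation reported for val_bpb can be recomputed from the three per-seed values. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not taken from the submission.

```python
import statistics

# Per-seed val_bpb values copied from the 3-seed results (seeds 314, 777, 2718).
val_bpb = {314: 1.016476, 777: 1.016192, 2718: 1.018633}
values = list(val_bpb.values())

mean_bpb = statistics.mean(values)
# statistics.stdev uses the sample (n - 1) denominator.
std_bpb = statistics.stdev(values)

print(f"mean val_bpb = {mean_bpb:.8f}")
print(f"sample std   = {std_bpb:.8f}")
```

With n = 3 the sample and population standard deviations differ noticeably, so the (n - 1) denominator is the one to use when comparing against the reported figure.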

Architecture

This submission uses an SP1024-tokenized GDN-Hybrid backbone with the following high-level structure:

[GDN×5] → SWA → [GDN×5] → SWA_shared
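
As a rough illustration, the bracket notation above can be expanded into a flat list of layer labels. This is only a sketch of the ordering: `build_layout` is a hypothetical helper, and reading `SWA_shared` as a block that reuses the first SWA block's weights is an assumption based on the name, not something the description confirms.

```python
def build_layout(gdn_per_group: int = 5) -> list[str]:
    """Expand '[GDN x N] -> SWA -> [GDN x N] -> SWA_shared' into layer labels."""
    layout: list[str] = []
    layout += ["GDN"] * gdn_per_group  # first group of Gated DeltaNet blocks
    layout.append("SWA")               # sliding-window attention block
    layout += ["GDN"] * gdn_per_group  # second group of Gated DeltaNet blocks
    layout.append("SWA_shared")        # assumed to reuse the first SWA block's weights
    return layout


layout = build_layout()
print(len(layout), "blocks,", layout.count("GDN"), "of them GDN")
print(" -> ".join(layout))
```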

Key components:

  1. SP1024 tokenizer
  2. Gated DeltaNet hybrid backbone
  3. Sliding-window attention side path
  4. MuonEq-R + AdamW training mix
  5. EMA = 0.997
  6. Late QAT threshold = 0.15
  7. GPTQ int6 + zstd-22 packaging
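
For readers unfamiliar with the EMA setting, a decay of 0.997 corresponds to the standard exponential-moving-average update over model weights. The sketch below shows that update on plain Python lists; how often the submission applies it and to which tensors is not stated, so the per-step loop and the helper name `ema_update` are assumptions.

```python
EMA_DECAY = 0.997  # value from the component list


def ema_update(ema_weights, weights, decay=EMA_DECAY):
    """Return new EMA weights: decay * old_ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Toy example: the "weights" stay fixed and the EMA starts at zero.
ema = [0.0, 0.0]
for _ in range(1000):
    ema = ema_update(ema, [1.0, 2.0])

# After k steps with a constant target t, the EMA equals t * (1 - decay**k).
print(ema)
```

A decay of 0.997 gives an averaging horizon of roughly 1 / (1 - 0.997) ≈ 333 steps, which is short relative to the ~2,200 training steps reported per seed.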

Legality

This is a SAFE_SUBMISSION / Track-A fixed-predictor result. The scored artifact uses no TTT, no SLOT, and no eval-time adaptation. All three pulled artifacts are under the 16,000,000-byte cap.
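
The size claim can be checked directly against the reported artifact bytes. A minimal sketch, assuming the cap is a strict upper bound (the text says "under"); the dictionary and constant names are illustrative.

```python
CAP_BYTES = 16_000_000

# Artifact sizes per seed, copied from the 3-seed results.
artifact_bytes = {314: 15_522_111, 777: 15_814_260, 2718: 15_981_262}

# Remaining bytes under the cap for each seed.
headroom = {seed: CAP_BYTES - size for seed, size in artifact_bytes.items()}

for seed, spare in headroom.items():
    status = "OK" if spare > 0 else "OVER CAP"
    print(f"seed {seed}: {spare:,} bytes of headroom ({status})")
```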

Credits

sunnypatneedi pushed a commit to sunnypatneedi/parameter-golf that referenced this pull request Apr 12, 2026
…1.01710

Merged SOTA changed from 1.1147 to 1.0810 (PR openai#1493, bigbag, 2026-04-09).
Six PRs merged in 5 days (PRs openai#1334, openai#1285, openai#1394, openai#1412, openai#1413, openai#1477, openai#1493).
New target: ≤1.0760 val_bpb. 18 days to deadline.

Key findings:
- GDN-Hybrid (PR openai#1564): 1.01710 BPB, no TTT/SLOT — monitor for organizer review
- VarLen Attention + Doc-TTT (PR openai#1560): 1.07406 BPB — implement next
- TMA Megakernel + Tap-In (PR openai#1555): 1.07636 BPB — add after openai#1560
- PR openai#731 n-gram (dense count + Laplace): reviewer says LOOKS CLEAN, awaiting 3rd seed
- PR openai#758: major legality flags, do not implement

Updated CLAUDE.md: Competition Strategy, Technique Reference, Lessons Learned (Session 9).
Updated logs/daily_research.md: new 2026-04-12 entry prepended.

https://claude.ai/code/session_011WyxjcwdigLhMFQDjLL5ss
@joshkmartinez
Author

Superseded by PR #1575, which stages the stronger run051-safe031 SAFE_SUBMISSION artifact (1.01671233 BPB, all seeds under cap).
